Combination of ligand and structure based virtual screening approaches for the discovery of potential PARP1 inhibitors

Poly (ADP-ribose) polymerase 1 (PARP1) has high therapeutic value as biomolecular target for research and development of small molecules with antineoplastic activity, since it is upregulated in many cancers, especially in ovarian and BRCA 1/2 mutated breast cancers. Decades of investigation of PARP inhibitors (PARPi) have led to the approval of several drug compounds, however clinical application of PARPi in cancer therapy is limited due to a number of factors, including low selectivity, weak affinity and undesired side effects. Thus, identification of novel drug-like chemical compounds with alternatives to the known PARPi chemical scaffolds, binding modes and interaction patterns with amino acid residues in the active site is of high therapeutic importance. In this study we applied a combination of ligand- and structure-based virtual screening approaches with the goal of identification of novel potential PARPi.


Introduction
Poly (ADP-ribose) polymerase 1 (PARP1) has a crucial role in many biological processes, including differentiation, proliferation, transcription, DNA repair and inflammation. PARP1 enzyme has special therapeutic interest since it is upregulated in many cancers, including ovarian, breast, skin, colorectal, lung and others [1][2][3][4][5][6]. The activation of PARP1 is caused by the damage in DNA and facilitates the recruitment of a set of proteins to ensure the reparation process of single-strand DNA breaks. There are four functional domains in the PARP1 enzyme: (N)-terminal DNA-binding domain (DBD), auto-modification domain (AD), WGR domain, and the carboxy (C)-terminal catalytic domain (CAT) [7,8]. CAT has ADP-ribosyl transferase (ART) subdomain, which includes two amino acid residues that are important for NAD + binding (tyrosine and histidine), and also a glutamic acid residue for polymerase activity [8][9][10][11]. The main mechanism of action of PARPi involves their interaction with the catalytic domain of PARP enzyme. The interaction of PARPi with the β-NAD+ binding site of PARP1 leads to the inhibition of its enzymatic activity. Dysfunction of the repair mechanism drives the increase of errors and ultimately results in the cell death [12][13][14][15]. PARPi are the first approved drugs with an antineoplastic activity, whose mechanism of action targets the DNA damage response in BRCA1/2 mutated breast and ovarian cancers [5,2,16]. Around 88 clinical trials of PARP inhibitors listed in clinicaltrials.gov have been completed and several PARPi have been approved for the treatment of BRCA-mutated ovarian, breast and pancreatic cancer. Currently more clinical trials are registered and being conducted on examining the use of PARP inhibitors as an anti-cancer therapy in chemo-resistant germline or somatic BRCA1/2 mutated breast, ovarian, lung, and pancreatic cancers [17]. PARPi have clinically presented promising efficiency against different carcinomas, especially against breast and ovarian cancer [18,19]. Clinical application of PARPi in cancer therapy is limited due to a number of factors, including low selectivity, weak affinity and undesired side effects [20]. Identification of novel drug-like chemical compounds with alternatives to the known PARPi chemical scaffolds, binding modes and interaction patterns with amino acid residues in the active site is of high therapeutic importance. There are currently several FDA-approved PARP inhibitors available, including olaparib (Lynparza, AstraZeneca), Rucaparib (Rubraca, Clovis Oncology Inc.), Niraparib (ZEJULA, Tesaro Inc.) and talazoparib (TALZENNA, Pfizer Inc.). Olaparib was the first PARP inhibitor that entered clinical trials and been approved by FDA and EMA as a monotherapy for BRCA-mutated for the treatment of advanced, germline BRCA mutated ovarian cancer [21]. Niraparib has been approved for the maintenance treatment of reoccurring ovarian, fallopian and primary peritoneal carcinomas [22]. Talazoparib was approved for the treatment of germline BRCA1/2-mutated advanced or metastatic HER2-negative breast cancer in 2018 [23]. These FDA-approved compounds and other investigational compounds (veliparib [24] and fluzoparib [25]) are nonselective PARP1 inhibitors that also inhibit PARP2 with similar potency. Besides catalytic inhibition of PARP protein, other important mechanism of PARPi involves trapping of PARP1 protein on the damaged site of DNA, causes blockade of replication and transcription fork and ultimately leads to death of cancer cells [26]. Among all aforementioned compounds, talazoparib is widely known as the most potent trapping inhibitor, with 100-fold more efficiency in PARP trapping than olaparib. The ability to trap the PARP1 on DNA differs than the catalytic inhibition and is of clinical importance as inhibitory mechanism leading to the death of cancer cells. Therefore, in our study we applied the ligandbased computational methods for the identification of novel potential drug-like chemical scaffolds, alternatives to the known PARPi (talazoparib, in particular) in combination with structure-based methods for binding pose and affinity prediction and analysis of the results.

Preparation of library of compounds and ligand-based virtual screening
The purchasable dataset of ZINC20 [27] (http://zinc20.docking.org/) was used as a library for virtual screening of potential PARP inhibitors (~13.3 million compounds). Dataset was filtered using the PAINS filter [28]. Thereafter the dataset was filtered based on the shape similarity to the talazoparib using OpenEyes scientific ROCS software [29]. ROCS is a powerful ligandbased virtual screening tool which was used for the rapid identification of potentially active compounds by shape comparison to the talazoparib. ROCS performs shape-based overlays, which are based on a description of the molecules as atom-centered gaussian functions, of a candidate molecule to a reference molecule. Two scores was used to evaluate the tested compounds to the reference: 1) the Shape Tanimoto coefficient is used to rank molecules against the query molecule based on their shape similarity and 2) The Color Tanimoto score, that counts appropriate overlap of groups that describe properties, such as H-bond donors and acceptors, cations, anions, rings, etc.

Structure-based virtual screening
Virtual screening was performed using the ICM-PRO software package [30]. ICM-PRO demonstrated high accuracy based on the multiple benchmark studies for molecular docking and virtual screening software among both academic and commercial software [31,32]. The docking algorithm of the ICM-PRO software is based on the Monte Carlo minimization approach [33]. The scoring function of ICM-PRO software is a weighted sum that includes van der Waals energy of the ligand-target interactions, internal force field energy of the ligand, hydrogen bonding interactions, hydrogen bond donor-acceptor desolvation energy, hydrophobic free energy gain and others. The 3D structure of PARP in complex with one of the most potent inhibitors (talazoparib) was downloaded from Protein Data Bank (PDB ID: 7KK3) and used for virtual screening.

Re-scoring using MMGBSA approach
The algorithm for MMGBSA binding energy calculation includes three stages: 1) parametrization of the receptor and ligand, 2) minimization and 3) MMGBSA calculations [34]. General Amber Force Field (GAFF) [35] with AM1-BCC charge model [36] was used for small molecule parametrization, while ff14SB [37] force field is used to describe protein parameters. The algorithm for the binding energies calculations is based on a freely-available AmberTools suite (www.ambermd.org/AmberTools) and the full code is available at the following link: https:// github.com/sahakyanhk/iPBSA.
Clusterization dendrogram and figures of chemical structures were obtained using ICM-PRO. Comparative analysis of physicochemical features of the identified compounds with reference ligand (talazoparib) was performed using OpenEye ROCS's ROCSReport utility (https: //www.eyesopen.com/rocs).

Molecular dynamics simulations
AMBER20 package was used to carry out molecular dynamics simulations [38]. Protein parametrization was performed using the ff14SB force field, while for ligand parametrization GAFF with AM1-BCC charge model was used. Minimized conformations of complexes of PARP1 protein with selected compounds obtained from previous stage (MMGBSA re-scoring) were used as starting positions for corresponding simulations. The complexes were solvated in TIP3P water model and Na+/Cl− ions at 150 mM concentration [39]. The Monte Carlo barostat [40] with reference pressure at 1 bar and Langevin thermostat [41] with collision frequency (gamma_ln) 2 ps −1 were used to keep the temperature at 310.15 K. The Particle Mesh Ewald (PME) method with 1.0 nm cutoff was used for the long-range electrostatic interactions. Each simulation consisted of 5 ns of system minimization and equilibration and 100 ns of conventional molecular dynamics simulation. Finally, for every simulation, binding energies were calculated using MMPBSA.py program [42], using 250 snapshots with equal intervals collected from the last 20 ns of simulation. RMSD was calculated as indicator of stability of studied complexes during simulations. Besides, RMSF analysis was performed to measure the average atomic flexibility of the Cα atoms of the docked complexes.

Results and discussion
10,374,250 compounds remained as the result of filtration of the initial database (~13,276,808 compounds) using PAINS filter. These 10,374,250 compounds were additionally filtered for shape similarity to the reference ligand (talazoparib) using OpenEye's ROCS tool and 500,000 compounds with highest shape similarity to the reference compound were selected for the following molecular docking study.
Molecular docking of the selected 500,000 compounds against the binding site of the catalytic domain of PARP1 enzyme (PDB ID: 7KK3, chain C) was performed using ICM-PRO software. As the result of molecular docking, 168 compounds demonstrated higher docking scores in comparison to the reference ligand (S1 Table).
With the goal of validation of obtained results, binding energies of identified 168 compounds were recalculated using MMGBSA method. MMGBSA re-scoring showed that only 74 compounds demonstrated close or higher binding energies in comparison to talazoparib (Fig 1).
These 74 compounds can be divided into 10 clusters based on their chemical similarity ( Fig  2).
Similarity analysis of the four selected compounds (representative compounds of clusters with the highest estimated binding energies) and the reference drug-compound (talazoparib) demonstrated significant differences in both chemical structure and functional groups (Fig 3). where SER 904 and GLY863, again, are common interacting residues for both compounds. ZINC40467580 forms 6 conventional bonds with 5 amino acid residues: GLY 863, SER 864, ARG 878 and SER 904, while ZINC170624334 forms 5 conventional hydrogen bonds with GLY 863, MET 890, SER 904 and GLU 988. Again, only SER 904 and GLY 863 are common interacting amino acid residues of PARP1 involved in hydrogen bonds formation for ZINC40467580, ZINC170624334 and reference drug compound talazoparib. Thus, all five studied compounds, including reference drug compound talazoparib form conventional hydrogen bonds with the SER 904 and GLY 863. Formation of hydrogen bonds with ASP 770 and GLU 988 is unique for the ZINC6547121 and ZINC170624334 respectively. ZINC6547121 and ZINC40467580 both form conventional hydrogen bonds with ARG 878. ZINC6547121 and ZINC170624334 interact with the MET 890.
To obtain information on stability of interactions and binding tendency of selected compounds and reference ligand, 5 molecular dynamics simulations were performed. Based on the RMSD values of studied compounds (<0.2 nm) and RMSF values of PARP1 amino acid residues during the performed molecular dynamics simulations, all studied interactions demonstrated stability (Fig 5).
Binding free energies of studied interactions were recalculated using MMPBSA approach, which is relatively more reliable and accurate method in comparison to the MMGBSA approach, based on the trajectories obtained from performed molecular dynamics simulations. Based on the "delta total" values (Fig 6), which is final estimated binding free energy calculated from other presented energetic terms, compounds ZINC40467580 (-35.83 kcal/mol) and ZINC11692074 (-35.29 kcal/mol) had lower binding energies than reference ligand talazoparib (-30.55), while other two compounds ZINC170624334 (-26.72 kcal/mol) and ZINC6547121 (-25.28 kcal/mol) demonstrated higher binding energies. Remarkably all studied compounds in comparison to talazoparib demonstrated relatively lower values of electrostatic energy and van der Waals forces (Fig 6). However due to much higher electrostatic contribution of the studied compounds to the solvation free energy calculated by PB, overall "delta total" energy values of studied compounds are close to talazoparib. Differences in contribution of studied energetic terms to interaction of studied compounds, highlights additional interest and potential of selected compounds from drug design point of view.

Conclusion
As the result of molecular docking and MMGBSA re-scoring experiments several chemical compounds with close or higher binding energies to the PARPi active site have been identified. The 2D chemical similarity analysis showed that the identified compounds include alternative to talazoparib chemical components and scaffolds. Differences in the interaction patterns of these compounds and talazoparib with amino acid residues of the active site of the catalytic domain of PARP1 enzyme indicate that these compounds might potentially have different therapeutically valuable properties. These compounds are of great interest for their further research as potential inhibitors of the PARP1 enzyme.
Supporting information S1 Table. List of compounds with higher docking scores than talazoparib.